# 45 Gram-Schmidt process and QR factorization.

## Gram-Schmidt process.

We now return to the Gram-Schmidt process of producing an orthogonal set, and further an orthonormal set, from a linearly independent set. The procedure goes as follows.

> **Gram-Schmidt process.**
> Input: Linearly independent vectors $w_{1},w_{2},\ldots,w_{k}$
> Output: An orthogonal set $u_{1},u_{2},\ldots,u_{k}$ where $$ \begin{array}{l} \operatorname{span} (u_{1})=\operatorname{span} (w_{1}) \\ \operatorname{span} (u_{1},u_{2}) = \operatorname{span} (w_{1},w_{2}) \\ \operatorname{span} (u_{1},u_{2},u_{3}) = \operatorname{span} (w_{1},w_{2},w_{3}) \\ \vdots \\ \operatorname{span} (u_{1},u_{2}, \ldots, u_{k}) = \operatorname{span} (w_{1},w_{2},\ldots,w_{k}) \end{array} $$
> Procedure: $$ \begin{array}{l} u_{1} = w_{1} \\ u_{2} = w_{2} - \text{proj}_{u_{1}}(w_{2}) \\ u_{3} = w_{3} - \text{proj}_{u_{1}}(w_{3}) - \text{proj}_{u_{2}}(w_{3}) \\ u_{4} = w_{4} - \text{proj}_{u_{1}}(w_{4}) - \text{proj}_{u_{2}}(w_{4}) - \text{proj}_{u_{3}}(w_{4}) \\ \ \ \ \ \ \ \ \vdots \\ \displaystyle u_{k} = w_{k} - \sum_{i=1}^{k-1} \text{proj}_{u_{i}}(w_{k}) \end{array} $$
> If one further wants an orthonormal set, normalize each vector by computing $$ \tilde u_{i} = \frac{u_{i}}{\Vert u_{i}\Vert} $$

The idea is an extension of the 2-dimensional Gram-Schmidt process. We successively subtract off the orthogonal projections onto the previously constructed orthogonal vectors; what is left is the part orthogonal to each previous vector, while the partial spans are preserved.

**Remark.** Don't forget, the one-dimensional orthogonal projection of a vector $x$ onto a nonzero vector $u$ is $$ \text{proj}_{u}(x) = \frac{u\cdot x}{u\cdot u} u $$

**Remark.** We often need to produce an orthogonal basis for a subspace, and Gram-Schmidt says: if you have a linearly independent basis, you can turn it into an orthogonal basis. If you further wish it to be orthonormal, then normalize the orthogonal set that you got.

**Remark.** The Gram-Schmidt process is order dependent. Whichever vector comes first is not changed at all, while from each subsequent vector we subtract more and more components along the previously constructed vectors.
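For concreteness, here is a minimal sketch of the procedure in Python with NumPy. The helper names `gram_schmidt` and `normalize_columns` are ours for illustration, not library routines.

```python
import numpy as np

def gram_schmidt(W):
    """Orthogonalize the columns of W, assumed linearly independent.

    Column k of the result is u_k = w_k - sum of proj_{u_i}(w_k) over i < k,
    so the partial column spans agree with those of W.
    """
    W = np.asarray(W, dtype=float)
    U = np.zeros_like(W)
    for k in range(W.shape[1]):
        u = W[:, k].copy()
        for i in range(k):
            ui = U[:, i]
            u -= (ui @ W[:, k]) / (ui @ ui) * ui  # subtract proj_{u_i}(w_k)
        U[:, k] = u
    return U

def normalize_columns(U):
    """Divide each column by its length to get an orthonormal set."""
    return U / np.linalg.norm(U, axis=0)
```

The inner loop is the formula $u_{k} = w_{k} - \sum_{i=1}^{k-1}\text{proj}_{u_{i}}(w_{k})$ verbatim; in floating-point arithmetic one would usually prefer the "modified" Gram-Schmidt variant or a library QR routine for numerical stability.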
**Example.** Find an orthonormal basis for $W$, where $$ W = \operatorname{span} \{\begin{pmatrix}1\\1\\1\\2\end{pmatrix},\begin{pmatrix}1\\1\\2\\1\end{pmatrix},\begin{pmatrix}1\\2\\1\\1\end{pmatrix}\} $$

$\blacktriangleright$ We apply Gram-Schmidt on the linearly independent set $$ w_{1} = \begin{pmatrix}1\\1\\1\\2\end{pmatrix},w_{2}=\begin{pmatrix}1\\1\\2\\1\end{pmatrix}, w_{3} = \begin{pmatrix}1\\2\\1\\1\end{pmatrix} $$ to get an orthogonal basis $u_{1},u_{2},u_{3}$ first: $$ \begin{array}{l} u_{1} = w_{1} = \begin{pmatrix}1\\1\\1\\2\end{pmatrix} \\ u_{2} = w_{2} - \text{proj}_{u_{1}}(w_{2}) = \begin{pmatrix}1\\1\\2\\1\end{pmatrix} - \frac{6}{7} \begin{pmatrix}1\\1\\1\\2\end{pmatrix} = \begin{pmatrix} 1 / 7 \\ 1 / 7 \\ 8 / 7 \\ -5 / 7\end{pmatrix} = \frac{1}{7} \begin{pmatrix}1\\1\\8\\-5\end{pmatrix} \\ u_{3} = w_{3} - \text{proj}_{u_{1}}(w_{3}) - \text{proj}_{u_{2}}(w_{3}) \\ \quad=\begin{pmatrix}1\\2\\1\\1\end{pmatrix} - \frac{6}{7} \begin{pmatrix}1\\1\\1\\2\end{pmatrix} - \frac{(1/7)\cdot 6}{(1/7)^{2}\cdot 91} \cdot \frac{1}{7} \begin{pmatrix}1\\1\\8\\-5\end{pmatrix} \\ \quad= \begin{pmatrix}1\\2\\1\\1\end{pmatrix} - \frac{6}{7} \begin{pmatrix}1\\1\\1\\2\end{pmatrix} - \frac{6}{91} \begin{pmatrix}1\\1\\8\\-5\end{pmatrix} = \begin{pmatrix}1 / 13 \\ 14 / 13 \\ - 5 / 13 \\ - 5 / 13 \end{pmatrix} \end{array} $$ Now we normalize the orthogonal set to make it orthonormal, so we have an orthonormal basis for $W$: $$ \frac{1}{\sqrt{7}}\begin{pmatrix}1\\1\\1\\2\end{pmatrix}, \frac{1}{\sqrt{91}}\begin{pmatrix}1\\1\\8\\-5\end{pmatrix}, \frac{1}{\sqrt{247}}\begin{pmatrix}1\\14\\-5\\-5\end{pmatrix}. \quad\blacklozenge $$

**Remark.** By cleverly "rescaling", one can avoid a lot of fractions during one's computation in Gram-Schmidt.

## QR-Factorization.

Let us look at the Gram-Schmidt process carefully, producing an orthogonal set $$ u_{1},u_{2},\ldots,u_{k} $$ from a linearly independent set $$ w_{1},w_{2},\ldots,w_{k} . $$ Note that we can rewrite $$ \begin{array}{l} w_{1} = u_{1} \\ w_{2} = u_{2} + \text{proj}_{u_{1}}(w_{2}) \\ w_{3} = u_{3} + \text{proj}_{u_{1}}(w_{3}) +\text{proj}_{u_{2}}(w_{3}) \\ \quad\vdots \end{array} $$ And since $\text{proj}_{u}(x)\in \operatorname{span}(u)$, we see that $$ \begin{array}{l} w_{1} \in \operatorname{span} (u_{1}) \\ w_{2} \in \operatorname{span} (u_{1},u_{2}) \\ w_{3} \in \operatorname{span} (u_{1},u_{2},u_{3}) \\ \quad\vdots \end{array} $$ In particular, $w_{k} = u_{k} + \sum_{i=1}^{k-1} c_{i}u_{i}$, for some $c_{i}$. So there is a matrix relation $$ \begin{bmatrix}| & | & & | \\ w_{1} & w_{2} & \cdots & w_{k} \\ | & | & & |\end{bmatrix} = \begin{bmatrix}| & | & & | \\ u_{1} & u_{2} & \cdots & u_{k}\\| & | & & |\end{bmatrix} \begin{bmatrix}1 & \ast & \ast & \cdots & \ast\\ 0 & 1 & \ast & \cdots & \ast\\ 0 & 0 & 1 & \cdots & \ast \\ \vdots & & & \ddots & \vdots \\ & & & & 1\end{bmatrix} $$ This gives a factorization result:

> **Orthogonal factorization.**
> Let $A$ be a matrix with linearly independent columns $w_{1},w_{2},\ldots,w_{k}$. Then $A$ can be factorized as $$ A = UR $$ where $U$ is a matrix whose columns are the orthogonal vectors $u_{1},u_{2},\ldots,u_{k}$ obtained from the Gram-Schmidt process on $w_{1},w_{2},\ldots,w_{k}$, and $R$ is an upper triangular square matrix with diagonal entries all $1$'s, where $$ u_{1} = w_{1}\quad\text{and}\quad u_{k} = w_{k} - \sum_{i=1}^{k-1}\text{proj}_{u_{i}}(w_{k}). $$

**Remark.** The matrices $A$ and $U$ need not be square, but they are of the same size.
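To double-check the example and the $A = UR$ factorization numerically, one can run a sketch like the following, reusing the illustrative `gram_schmidt` helper from above. Since $U^{T}U$ is diagonal, the formula $R = (U^{T}U)^{-1}U^{T}A$ produces exactly the coefficients $(u_{i}\cdot w_{j})/(u_{i}\cdot u_{i})$, i.e. a unit upper triangular $R$.

```python
import numpy as np

A = np.array([[1, 1, 1],
              [1, 1, 2],
              [1, 2, 1],
              [2, 1, 1]], dtype=float)  # columns are w1, w2, w3

U = gram_schmidt(A)  # our sketch from above, not a library routine
# Rescaling columns by 1, 7, 13 recovers (1,1,1,2), (1,1,8,-5), (1,14,-5,-5):
print(U * np.array([1, 7, 13]))

# Since U^T U is diagonal, this R is unit upper triangular and A = UR:
R = np.linalg.inv(U.T @ U) @ (U.T @ A)
print(np.allclose(A, U @ R))  # True
```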
Now, if we further normalize each column vector $u_{i}$ in $U$, and move each length $\Vert u_{i}\Vert$ into the corresponding diagonal entry of $R$, we have $$ \begin{bmatrix}| & | & & | \\ w_{1} & w_{2} & \cdots & w_{k} \\ | & | & & |\end{bmatrix} = \begin{bmatrix}| & | & & | \\ \frac{u_{1}}{\Vert u_{1}\Vert} & \frac{u_{2}}{\Vert u_{2}\Vert} & \cdots & \frac{u_{k}}{\Vert u_{k}\Vert}\\| & | & & |\end{bmatrix} \begin{bmatrix}\Vert u_{1}\Vert & \ast & \ast & \cdots & \ast\\ 0 & \Vert u_{2}\Vert & \ast & \cdots & \ast\\ 0 & 0 & \Vert u_{3}\Vert & \cdots & \ast \\ \vdots & & & \ddots & \vdots \\ & & & & \Vert u_{k}\Vert\end{bmatrix} $$

This is called

> **QR-Factorization.**
> Let $A$ be a matrix with linearly independent columns. Then we can factorize $A$ as $$ A = QR $$ where $Q$ is a matrix with orthonormal columns and $R$ is an upper triangular matrix whose diagonal entries $\Vert u_{1}\Vert,\ldots,\Vert u_{k}\Vert$ are the altitude heights of the parallelepiped formed by the columns of $A$.

**Remark.** Sometimes we compute $R$ by simply taking $R = Q^{T}A$, since, as one recalls, if $Q$ is a matrix with orthonormal columns, then $Q^{T}Q = I$, so $Q^{T}A = Q^{T}QR = R$.

A corollary is that

> **$k$-dimensional volume of a parallelepiped.**
> Let $A$ be an $n\times k$ matrix with linearly independent columns. Then the geometric $k$-dimensional volume of the $k$-dimensional parallelepiped made by the $k$ columns of $A$ in $\mathbb R^{n}$ is given by $$ \operatorname{vol}_{k}(A) = \det(R), $$ where $A$ has QR-factorization $A = QR$.

In particular, one can bypass the computation of the QR-factorization by exploiting properties of $Q$ and determinants.

> **Gramian determinant.**
> Let $A$ be an $n\times k$ matrix with linearly independent columns. Then the geometric $k$-dimensional volume of the $k$-dimensional parallelepiped made by the $k$ columns of $A$ in $\mathbb R^{n}$ is given by $$ \operatorname{vol}_{k}(A) = \sqrt{\det (A^{T}A)}. $$

$\blacktriangleright$ Proof. Since we can perform QR-factorization on $A$, we have $A=QR$ for some matrix $Q$ with orthonormal columns, and $R$ an upper triangular matrix whose diagonal records the altitude heights of the parallelepiped formed by the columns of $A$; hence $\det(R)$, the product of these heights, is the volume of the parallelepiped. Now note that $$ \begin{align*} \sqrt{\det(A^{T}A)} &= \sqrt{\det\left((QR)^{T}(QR)\right)} \\ &= \sqrt{\det(R^{T}Q^{T}QR)} \\ &= \sqrt{\det(R^{T}R)} \\ &= \sqrt{\det(R^{T})\det(R)} \\ &= \sqrt{\det(R)^{2}} \\ &= \det(R) = \operatorname{vol}_{k}(A), \end{align*} $$ where the square root resolves to $\det(R)$ itself because the diagonal entries $\Vert u_{i}\Vert$ of $R$ are positive. $\blacksquare$
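As a numerical sanity check on both volume formulas, here is a sketch using NumPy's built-in QR on the example from earlier. Note that `np.linalg.qr` may negate some columns of $Q$ and rows of $R$, hence the absolute value on the diagonal product.

```python
import numpy as np

A = np.array([[1, 1, 1],
              [1, 1, 2],
              [1, 2, 1],
              [2, 1, 1]], dtype=float)

Q, R = np.linalg.qr(A)  # reduced QR: Q is 4x3 with orthonormal columns
print(np.allclose(A, Q @ R))            # True
print(np.allclose(Q.T @ Q, np.eye(3)))  # Q^T Q = I

# 3-dimensional volume of the parallelepiped spanned by the columns of A:
vol_from_R = abs(np.prod(np.diag(R)))            # |det(R)|
vol_from_gram = np.sqrt(np.linalg.det(A.T @ A))  # sqrt(det(A^T A))
print(vol_from_R, vol_from_gram)  # both equal sqrt(19) ≈ 4.3589
```

Consistently with the worked example, $\Vert u_{1}\Vert\,\Vert u_{2}\Vert\,\Vert u_{3}\Vert = \sqrt{7}\cdot\frac{\sqrt{91}}{7}\cdot\frac{\sqrt{247}}{13} = \sqrt{19}$, agreeing with $\sqrt{\det(A^{T}A)}$.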